
    Logical-Linguistic Model and Experiments in Document Retrieval

    Conventional document retrieval systems have relied on extensive use of the keyword approach with statistical parameters in their implementations. It now seems that such an approach has reached its upper limit of retrieval effectiveness, and therefore new approaches should be investigated for the development of future systems. With current advances in hardware, programming languages and techniques, natural language processing and understanding, and, more generally, the field of artificial intelligence, attempts are now being made to include linguistic processing in document retrieval systems. A few attempts have been made to include parsing or syntactic analysis in document retrieval systems, and the reported results show some improvement in retrieval effectiveness. The first part of this thesis investigates the use of linguistic processing further by including translation, instead of only parsing, in a document retrieval system. The translation process implemented is based on unification categorial grammar and uses C-Prolog as the building tool. It forms the main part of the process that indexes documents and queries into a knowledge base predicate representation. Instead of using the vector space model to represent documents and queries, we use a kind of knowledge base model which we call the logical-linguistic model. The development of a robust parser-translator to perform the translation is discussed in detail in the thesis, and a method of dealing with ambiguity is incorporated into its implementation. The retrieval process of this model is based on a logical implication process implemented in C-Prolog. In order to handle uncertainty in evaluating similarity values between documents and queries, meta-level constructs are built on top of the C-Prolog system. A logical meta-language, called UNIL (UNcertain Implication Language), is proposed for controlling the implication process. Using UNIL, one can write a set of implication rules and a thesaurus to define the matching function of a particular retrieval strategy. Thus, the matching operation between a document and a query is demonstrated and implemented as an inference using unification. An inference from a document to a query is made in the context of the global information represented by the implication rules and the thesaurus. A set of well-structured experiments is performed with various retrieval strategies on a test collection of documents and queries in order to evaluate the performance of the system, and the results obtained are analysed and discussed.
    The second part of the thesis implements and evaluates the imaging retrieval strategy as originally defined by van Rijsbergen. Imaging retrieval is implemented as relevance feedback retrieval with nearest neighbour information, defined as follows. One of the best retrieval strategies from the earlier experiments is chosen to perform the initial ranking of the documents, and a few top-ranked documents are retrieved and judged relevant or not by the user. From this set of retrieved and relevant documents, we obtain all other unretrieved documents which have any of the retrieved and relevant documents as their nearest neighbour. These unretrieved documents are potentially relevant as well, since they are 'close' to the retrieved and relevant ones, and their initial similarity values to the query are therefore updated according to their distances from their nearest neighbours. From the updated similarity values, a new ranking of documents can be obtained and evaluated. A few sets of experiments using the imaging retrieval strategy are performed with the following objectives: to search for an appropriate updating function for producing a new ranking of documents, to determine an appropriate nearest neighbour set, to find the relationship between retrieval effectiveness and the size of the document set shown to the user for relevance judgement, and, lastly, to find the effectiveness of multi-stage imaging retrieval. The results obtained are analysed and discussed. Overall, the thesis defines the logical-linguistic model in document retrieval and demonstrates it by building an experimental system referred to as SILOL (a Simple Logical-linguistic document retrieval system). A set of retrieval strategies is experimented with and the results obtained are analysed and discussed.
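    The nearest neighbour updating step described above can be sketched as follows. This is only an assumed illustration in Python: the update function shown (a distance-discounted boost from the relevant neighbour) stands in for whatever updating functions the thesis actually experiments with.

    def imaging_update(initial_sim, nearest_neighbour, distance, relevant):
        """Re-rank documents after relevance feedback with nearest neighbour information.

        initial_sim       : dict doc_id -> initial similarity to the query
        nearest_neighbour : dict doc_id -> doc_id of its nearest neighbour
        distance          : dict doc_id -> distance to that nearest neighbour
        relevant          : set of doc_ids retrieved and judged relevant by the user
        """
        updated = dict(initial_sim)
        for doc, nn in nearest_neighbour.items():
            if doc not in relevant and nn in relevant:
                # Assumed updating function: boost the document by its neighbour's
                # similarity, discounted by the distance between them.
                updated[doc] = initial_sim[doc] + initial_sim[nn] / (1.0 + distance[doc])
        # New ranking: highest updated similarity first.
        return sorted(updated, key=updated.get, reverse=True)

    ranking = imaging_update(
        initial_sim={"d1": 0.9, "d2": 0.15, "d3": 0.2},
        nearest_neighbour={"d2": "d1", "d3": "d2"},
        distance={"d2": 0.5, "d3": 0.5},
        relevant={"d1"},
    )
    print(ranking)   # ['d1', 'd2', 'd3']: d2 overtakes d3 because its neighbour d1 was judged relevant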

    A unified logical-linguistic indexing for search engines and question answering.

    Conventional information representation models used in search engines rely on extensive use of keywords and their frequencies for storing and retrieving information. It is believed that such an approach has reached its upper limit of retrieval effectiveness, and therefore new approaches should be investigated for the development of future, more effective engines. The logical-linguistic model is an alternative to the conventional approach, in which logic and linguistic formalisms provide the mechanism for a computer to understand the contents of the source and deduce answers to questions. The capability of deduction depends largely on the knowledge representation framework used. We propose a unified logical-linguistic model as a knowledge representation framework that serves both as a basis for indexing documents and as a deduction capability for providing answers to queries. The approach applies semantic analysis to transform and normalise information from natural language texts into a declarative knowledge-based representation in first order predicate logic. Retrieval of relevant information can then be performed through plausible logical implication, and answering a query is carried out using theorem proving techniques. This paper elaborates on the model and how it is used in a search engine and a question answering system as one unified model.
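    As a rough sketch of the kind of representation and matching involved, the fragment below (Python, with invented predicate names and facts, not the paper's implementation) indexes statements as first order predicate facts and answers a query by binding its variables through a simple unification step.

    # Assumed example facts in a predicate representation.
    FACTS = [
        ("author_of", "chomsky", "syntactic_structures"),
        ("published_in", "syntactic_structures", "1957"),
    ]

    def unify(query, fact):
        """Bind query variables (strings starting with '?') against a ground fact."""
        if len(query) != len(fact):
            return None
        bindings = {}
        for q, f in zip(query, fact):
            if q.startswith("?"):
                if bindings.get(q, f) != f:
                    return None
                bindings[q] = f
            elif q != f:
                return None
        return bindings

    def answer(query):
        return [b for fact in FACTS if (b := unify(query, fact)) is not None]

    # "Who wrote Syntactic Structures?"
    print(answer(("author_of", "?x", "syntactic_structures")))   # [{'?x': 'chomsky'}]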

    2D text visualization for the retrieval of Malay documents

    Search engine applications like Google and Yahoo present their results as a one-dimensional linear list that usually spans about three screens per page and runs over several pages. The results are displayed in a list of declining ranks without showing the rank values themselves, and the one-dimensional linear list display makes classification of the result data meaningless. New queries related to the original query are offered, but their relationship strength values are not provided. An application that can display all the result data in a two-dimensional text visualization, within one page and in circular form, is proposed. The strength of the relationship between a result item and the query can be read from the distance between the location of the result item and the centre of the circle. Classifications expressed through text and colour can easily be applied in the application. The Malay translation of the Al-Quran and Malay translations of the hadith are used as corpora for the application. Three functions in the application display the relationships between words and words, between words and documents, and between documents and documents. Various combinations of formulas can be used to compute the values of these relationships, which serve as the rank values in the application. This two-dimensional text visualization (TDTV) application is evaluated using two mechanisms: first by solving a task, and then by answering a usability questionnaire. The results from the task section show that a variety of related documents can be retrieved in a reasonable time frame. The results from the usability questionnaire show that about 75 percent of the respondents agree that the two-dimensional text visualization (TDTV) application is better than applications that display their results as a one-dimensional linear list.
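    The mapping from relationship strength to position on the circle can be sketched as follows (a hypothetical Python illustration, not the TDTV implementation): stronger relationships are placed closer to the centre, and items are spread evenly around the circle.

    import math

    def circular_layout(strengths, radius=1.0):
        """strengths: dict item -> relationship strength with the query, in [0, 1]."""
        n = max(len(strengths), 1)
        layout = {}
        for i, (item, s) in enumerate(sorted(strengths.items(), key=lambda kv: -kv[1])):
            r = radius * (1.0 - s)          # stronger relationship -> closer to the centre
            theta = 2 * math.pi * i / n     # spread items evenly around the circle
            layout[item] = (r * math.cos(theta), r * math.sin(theta))
        return layout

    print(circular_layout({"doc_a": 0.9, "doc_b": 0.4, "doc_c": 0.7}))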

    Twelve anchor points detection by direct point calculation

    Facial feature extraction approaches can be categorised into three groups: region approaches, anchor point (landmark) approaches and contour approaches. In general, anchor point approaches provide a more accurate and consistent representation, and for this reason an anchor point approach has been chosen here. As experimental data sets have become larger, algorithms have become more sophisticated, even though the reported recognition rates are not as high as in some earlier works; this leads to higher complexity and a greater computational burden, which in turn affects the running time of real-time face recognition systems. We propose an approach that calculates the points directly from the text file in order to detect twelve anchor points (including the nose tip, mouth centre, right eye centre, left eye centre, upper nose and chin). To obtain the anchor points, the nose tip is detected first, then the upper nose and face points are localised, and lastly the outer and inner eye corners are localised. An experiment has been carried out with 420 models taken from GavabDB in two positions, with frontal views and variations in expression and position. Our results are compared with those of three similar studies and show that a better result is obtained, with a median error over the eight points of around 5.53 mm.
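    The first step, detecting the nose tip directly from the point data, might look like the following Python sketch. Both the file format and the detection criterion (the point closest to the camera on a frontal scan) are assumptions for illustration, not the paper's exact procedure.

    def load_points(path):
        """Read one 'x y z' triple per line from a plain text model file."""
        points = []
        with open(path) as fh:
            for line in fh:
                x, y, z = map(float, line.split()[:3])
                points.append((x, y, z))
        return points

    def nose_tip(points):
        # On a frontal scan the nose tip is commonly the point nearest the camera,
        # taken here as the maximum z coordinate (an assumption for this sketch).
        return max(points, key=lambda p: p[2])

    # points = load_points("face_model.txt")   # hypothetical file name
    # print(nose_tip(points))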

    Context aware knowledge bases for efficient contextual retrieval: design and methodologies

    Contextual retrieval is a critical component for efficient use of the knowledge hidden behind data, and it is also among the most important factors for user satisfaction. It essentially comprises two equally important parts: the retrieval mechanism and the knowledge base from which the information is retrieved. Despite their importance, context aware knowledge bases have not received much attention, which limits the efficiency of precise context aware retrieval. Such knowledge bases would not only store information efficiently, but the knowledge they contain would also be context based. In other words, machines would understand the knowledge and its context rather than just storing data, which would help in efficient and context aware retrieval. The current paper proposes rules and methodologies for the construction of such context aware knowledge bases. A case study demonstrating the application of the methodology and testing its efficiency is also presented. The results indicate that knowledge bases built on these principles tend to generate more efficient and better context aware retrieval results.
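    One way to picture a context aware knowledge base is as a store whose facts carry an explicit context tag, so that retrieval can filter on context as well as content. The Python sketch below is purely illustrative and is not the methodology proposed in the paper.

    from collections import defaultdict

    class ContextKB:
        def __init__(self):
            self._facts = defaultdict(list)      # context -> list of (subject, relation, object)

        def add(self, subject, relation, obj, context):
            self._facts[context].append((subject, relation, obj))

        def query(self, subject=None, relation=None, context=None):
            contexts = [context] if context else list(self._facts)
            for ctx in contexts:
                for s, r, o in self._facts[ctx]:
                    if subject in (None, s) and relation in (None, r):
                        yield ctx, s, r, o

    kb = ContextKB()
    kb.add("jaguar", "is_a", "animal", context="wildlife")       # assumed example facts
    kb.add("jaguar", "is_a", "car_brand", context="automotive")
    print(list(kb.query(subject="jaguar", context="automotive")))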

    Utilization of external knowledge to support answer extraction from restricted document using logical reasoning

    The idea of a computer system capable of simulating understanding, in the sense of reading a story or passage and answering questions about it, has received increased attention within the NLP community as a means to develop and evaluate robust question answering methods. This research is concerned with the problem of generating an automated answer in the context of sophisticated knowledge representation, reasoning, and logical inferential processing, and it focuses on Wh-type questions in a restricted domain. External knowledge sources, i.e. world knowledge (WK) and a hypernym matching procedure (HMP), are introduced. World knowledge refines the ability of the system to extract the relevant answers, and it provides a solution to the outstanding problem of ambiguity introduced by anaphora and synonyms. Hypernyms, being more general or broader keywords, widen the results of the search; the hypernym matching procedure thus gives a more coherent meaning to words and eases the process of extracting the answer to a given question. This research found that combining the external knowledge sources with word dependencies improved the accuracy of the question answering system with respect to human performance measures.
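    As an illustration of how hypernym matching widens a search, the Python fragment below uses a hand-made hypernym map; it is a sketch of the general idea rather than the paper's HMP.

    # Assumed toy taxonomy: each term maps to its more general (hypernym) terms.
    HYPERNYMS = {
        "sparrow": ["bird", "animal"],
        "bird": ["animal"],
    }

    def matches(question_term, candidate_term):
        """True if the candidate equals the question term or one of its hypernyms."""
        return candidate_term == question_term or candidate_term in HYPERNYMS.get(question_term, [])

    # The question mentions a "sparrow"; the passage only mentions a "bird".
    print(matches("sparrow", "bird"))   # True: the hypernym match widens the search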

    Rule based modeling of knowledge bases: rule based construction of knowledge base models for automation/expert systems

    It is critical to have a knowledge base model for efficient storage of extracted knowledge; this ensures that the knowledge is stored in a meaningful way so that it can be used for different applications. The efficiency of the knowledge base model depends largely on its rules of construction. Knowledge represented using logico-linguistic techniques and semantic networks lacks a consistent rule-based knowledge model. The current paper analyses text from the knowledge extraction, representation and semantic network phases in order to formulate rules that would lay the foundations of a knowledge model. The developed rules appear promising, providing comprehensive coverage of different scenarios. This extensive coverage indicates that the knowledge model will cater to the entire domain knowledge, thereby laying the foundations for automatic construction of efficient knowledge bases.
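    To make the idea of rule-based construction concrete, the Python sketch below maps extracted subject-verb-object triples to knowledge base predicates using a few simple rules. The rules shown are invented for illustration and are not those formulated in the paper.

    def triple_to_predicate(subject, verb, obj):
        verb = verb.lower()
        if verb in ("is", "are"):        # rule 1: copular verbs become is_a facts
            return ("is_a", subject, obj)
        if verb in ("has", "have"):      # rule 2: possession becomes has_part facts
            return ("has_part", subject, obj)
        return (verb, subject, obj)      # rule 3: otherwise the verb names the relation

    triples = [("heart", "is", "organ"), ("heart", "has", "valves")]
    kb = [triple_to_predicate(*t) for t in triples]
    print(kb)   # [('is_a', 'heart', 'organ'), ('has_part', 'heart', 'valves')]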

    Towards skolemize clauses binding for reasoning in inference engine

    The paper presents a reasoning technique for open-domain question answering (QA) systems. QA systems have attracted increasing attention as a means of meeting information needs by providing users with more precise and focused answers. We propose skolemize clauses binding (SCB) for reasoning, used along with theorem proving to provide the basis for answer extraction. QA systems employing the combination of SCB and resolution theorem proving can provide both satisfying and hypothetical answers. Satisfying answers are associated with ground terms corresponding to questions whose logical form contains variables. A hypothetical answer is an answer that comes from the story or plot of the text and requires logical reasoning, because it is not explicitly stated in the knowledge domain. In this case, the answer can be considered as a set of logical formulae, called skolemize clauses, defining sufficient conditions that characterise the tuples of individuals satisfying the query.
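    The contrast between the two kinds of answer can be sketched as follows. The facts, rule and helper below are invented for illustration and do not reproduce the paper's SCB procedure: a satisfying answer is a ground binding found directly in the knowledge, while a hypothetical answer is expressed through a skolem term introduced by a rule.

    FACTS = {("owns", "alice", "car1")}              # ground knowledge

    def skolem_rule(x):
        # Skolemized form of an existential rule such as: drives(X) -> exists V. owns(X, V)
        return ("owns", x, f"sk_vehicle({x})")

    def answer(person, drives):
        # Satisfying answer: a ground term found directly among the facts.
        for pred, subj, obj in FACTS:
            if pred == "owns" and subj == person:
                return ("satisfying", obj)
        # Hypothetical answer: obtained through the skolemized clause.
        if drives:
            return ("hypothetical", skolem_rule(person)[2])
        return ("unknown", None)

    print(answer("alice", drives=False))   # ('satisfying', 'car1')
    print(answer("bob", drives=True))      # ('hypothetical', 'sk_vehicle(bob)')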

    Accelerating Virtual Walkthrough with Visual Culling Techniques

    A virtual walkthrough application allows users to navigate and immerse themselves in a generated 3D environment with the assistance of computer graphics. The 3D environment requires a large amount of geometry to look realistic, but as the amount of geometry increases, the performance of the application becomes slower; consequently, there is a conflict between the need for realism and the need for real-time performance. In this paper, we discuss the implementation of visual culling techniques such as view frustum culling, back face culling and occlusion culling in a virtual walkthrough application. We render only what can be seen during the application runtime and cull away unnecessary geometry, which accelerates the performance of the system. Without culling techniques, a virtual reality application such as a virtual walkthrough has to allocate a large amount of memory to store the geometry data. We have tested these techniques on the Ancient Malacca data. With the visual culling techniques implemented, the virtual walkthrough system can work in real-time mode without sacrificing realism.
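    Of the three techniques mentioned, back face culling is the simplest to illustrate: a triangle is skipped when its normal points away from the camera. The Python sketch below is a generic illustration and not the renderer used in the paper.

    def cross(a, b):
        return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def is_back_facing(v0, v1, v2, camera_pos):
        normal = cross(sub(v1, v0), sub(v2, v0))     # face normal from the winding order
        view_dir = sub(v0, camera_pos)               # direction from the camera to the face
        return dot(normal, view_dir) >= 0            # facing away from the camera -> cull

    tri = ((0, 0, 0), (1, 0, 0), (0, 1, 0))          # normal points along +z
    print(is_back_facing(*tri, camera_pos=(0, 0, 5)))    # False: visible from +z
    print(is_back_facing(*tri, camera_pos=(0, 0, -5)))   # True: culled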

    Face recognition using local geometrical features - PCA with Euclidean classifier

    The goal of this research is to use the minimum number of features and produce better recognition rates. Before performing the feature selection, we investigate automatic methods for detecting face anchor points on 412 3D facial scans of 60 individuals, with 7 images per subject including views presenting light rotations and facial expressions. Each image has twelve anchor points: right outer eye, right inner eye, left outer eye, left inner eye, upper nose point, nose tip, right nose base, left nose base, right outer face, left outer face, chin, and upper face. All the control points are measured on an absolute scale (mm). Once all the control points have been determined, we extract a relevant set of features, classified into three groups: (1) distances between mass points, (2) angle measurements, and (3) angle measurements. Fifty-three local geometrical features are extracted from the 3D points of human faces to model the face for face recognition, and the discriminating power is calculated to show the most valuable features among them. Experiments performed on the GavabDB dataset (412 faces) show that our algorithm achieves 86% success at the first rank.
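    The recognition step itself can be sketched as projecting the geometric feature vectors with PCA and classifying a probe face by the smallest Euclidean distance to the gallery projections. The Python fragment below uses random toy data and is an illustration of the general pipeline, not the paper's implementation.

    import numpy as np

    def pca_fit(X, n_components):
        mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
        return mean, vt[:n_components]               # mean and principal axes

    def project(X, mean, axes):
        return (X - mean) @ axes.T

    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(10, 53))               # 10 faces x 53 geometric features (toy data)
    probe = gallery[3] + rng.normal(scale=0.01, size=53)

    mean, axes = pca_fit(gallery, n_components=5)
    g_proj = project(gallery, mean, axes)
    p_proj = project(probe, mean, axes)
    distances = np.linalg.norm(g_proj - p_proj, axis=1)    # Euclidean classifier
    print("identified as subject", int(np.argmin(distances)))   # -> 3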